Word Sense Disambiguation Using a Second Language Monolingual Corpus
Abstract
This paper presents a new approach for resolving lexical ambiguities in one language using statistical data from a monolingual corpus of another language. This approach exploits the differences between mappings of words to senses in different languages. The paper concentrates on the problem of target word selection in machine translation, for which the approach is directly applicable. The presented algorithm identifies syntactic relations between words, using a source language parser, and maps the alternative interpretations of these relations to the target language, using a bilingual lexicon. The preferred senses are then selected according to statistics on lexical relations in the target language. The selection is based on a statistical model and on a constraint propagation algorithm, which simultaneously handles all ambiguities in the sentence. The method was evaluated using three sets of Hebrew and German examples and was found to be very useful for disambiguation. The paper includes a detailed comparative analysis of statistical sense disambiguation methods.
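The selection step described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the paper's algorithm: the lexicon entries, the source-word labels (`src_verb`, `src_noun`), the relation name `verb-obj`, and all counts are invented toy data, and the function name `select_translations` is hypothetical. The sketch assumes a parser has already found one syntactic relation, and it simply picks the alternative with the highest raw count in the target-language corpus. The statistical model and the constraint propagation across all ambiguities in a sentence, both mentioned in the abstract, are left out.

```python
from itertools import product

# Toy bilingual lexicon (invented): source word -> alternative target words.
LEXICON = {
    "src_verb": ["sign", "seal", "close"],
    "src_noun": ["treaty", "contract"],
}

# Toy counts of (relation, head, dependent) tuples, standing in for
# statistics gathered from a target-language monolingual corpus (invented).
TARGET_COUNTS = {
    ("verb-obj", "sign", "treaty"): 25,
    ("verb-obj", "sign", "contract"): 12,
    ("verb-obj", "seal", "treaty"): 2,
    ("verb-obj", "close", "contract"): 1,
}


def select_translations(relation, head, dependent):
    """Pick the target-language (head, dependent) pair with the highest count.

    Returns None when none of the alternatives was observed in the corpus.
    """
    # Every combination of target alternatives for the two source words.
    candidates = product(LEXICON[head], LEXICON[dependent])
    scored = [
        (TARGET_COUNTS.get((relation, h, d), 0), h, d) for h, d in candidates
    ]
    count, best_head, best_dependent = max(scored)
    return (best_head, best_dependent) if count > 0 else None


print(select_translations("verb-obj", "src_verb", "src_noun"))
# prints ('sign', 'treaty')
```

Choosing by raw count is the simplest possible decision rule. A more faithful version would compare the best alternative against the others with a statistical test before committing to it, and would propagate each decision to the other relations in which the same word takes part.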
Similar resources
Unsupervised Monolingual and Bilingual Word-Sense Disambiguation of Medical Documents using UMLS
This paper describes techniques for unsupervised word sense disambiguation of English and German medical documents using UMLS. We present both monolingual techniques which rely only on the structure of UMLS, and bilingual techniques which also rely on the availability of parallel corpora. The best results are obtained using relations between terms given by UMLS, a method which achieves 74% prec...
Lexical Selection with a Target Language Monolingual Corpus and an MRD
In this paper, we propose a lexical selection method with three steps: sense disambiguation of source words, sense-to-word mapping, and selection of the most appropriate target language lexical item. The knowledge for each step is extracted from a machine readable dictionary and a target language monolingual corpus. By splitting the process of lexical selection into three steps and extracting t...
Word Sense Disambiguation Using Target Language Corpus in a Machine Translation System
This article studies different aspects of a new approach to word sense disambiguation using statistical information gained from a monolingual corpus of the target language. Here, the source language is English and the target is Persian, and the disambiguation method can be directly applied in the system of English-to-Persian machine translation for solving lexical ambiguity problems in this sys...
ParaSense or How to Use Parallel Corpora for Word Sense Disambiguation
This paper describes a set of exploratory experiments for a multilingual classification-based approach to Word Sense Disambiguation. Instead of using a predefined monolingual sense inventory such as WordNet, we use a language-independent framework where the word senses are derived automatically from word alignments on a parallel corpus. We built five classifiers with English as an input language...
Journal: Computational Linguistics
Volume: 20
Publication date: 1994